L-STATISTICS WITH SMOOTH WEIGHT FUNCTIONS JACKKNIFE WELL


by 

William  C.  Parr  and  William  R.  Schucany 
ABSTRACT 

The  behavior  of  linear  combinations  of  order  statistics  (L-statistics) 
under  jackknifing  is  discussed.  The  asymptotic  behavior  of  a  jackknifed 
L-statistic  is  produced,  and  the  pseudo-value  based  variance  estimate  is 
shown  to  be  consistent  under  moderate  smoothness  and  a  trimming  condition 
of the weight function. The results extend easily to functions of L-statistics
and thus answer a question posed in Miller (1974). Monte Carlo
results  support  small-sample  applicability  of  the  large-sample  results  for 
the  construction  of  approximate  confidence  intervals. 
1. INTRODUCTION, NOTATION, AND DEFINITIONS


Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ from a distribution with distribution function $F$, and let $X_{1,n}, X_{2,n}, \ldots, X_{n,n}$ denote the associated order statistics. For a fixed weight function $J(u)$ defined for $0 < u < 1$, we define the L-statistic

$$S_n = \frac{1}{n}\sum_{i=1}^{n} J\!\left(\frac{i}{n+1}\right) X_{i,n}. \tag{1.1}$$
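Definition (1.1) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `l_statistic` and the two example weight functions are assumptions introduced here, and the weight function is passed as an ordinary callable.

```python
def l_statistic(sample, J):
    """Return S_n = (1/n) * sum_{i=1..n} J(i/(n+1)) * X_{i,n}, as in (1.1)."""
    xs = sorted(sample)  # order statistics X_{1,n} <= ... <= X_{n,n}
    n = len(xs)
    if n == 0:
        raise ValueError("sample must be non-empty")
    return sum(J(i / (n + 1)) * x for i, x in enumerate(xs, start=1)) / n


# Illustrative weight functions (hypothetical names).
def mean_weight(u):
    return 1.0  # J(u) = 1 gives the sample mean


def logistic_weight(u):
    return 6.0 * u * (1.0 - u)  # J(u) = 6u(1 - u)


data = [2.0, 5.0, 3.0, 10.0]
print(l_statistic(data, mean_weight))
print(l_statistic(data, logistic_weight))
```

Sorting inside the function means the caller can pass the raw sample; the weights depend only on the rank $i$ and the sample size $n$.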


Other definitions, typically asymptotically equivalent to that above, include $T_n = \frac{1}{n}\sum_{i=1}^{n} J(i/n)\, X_{i,n}$ and $U_n = \sum_{i=1}^{n} c_{i,n} X_{i,n}$, where $c_{i,n} = \int_{(i-1)/n}^{i/n} J(u)\,du$. $S_n$ is chosen for study in this paper as being typical of actual L-statistics used in practice (use of $U_n$, in which the integration "smooths" $J$, might result in fewer conditions on the weight function).


L-statistics of the form of $S_n$ are often used in estimation problems since they are typically computationally simple and (at least for location and scale problems) asymptotically efficient given the proper choice of the weight function $J$. Thus, they are often good choices a) as estimates for their own sake, b) as good starting values for iterative estimation procedures, and c) as quick and consistent estimators of nuisance parameters (such as unknown scale in regression problems) to minimize the number of parameters being simultaneously estimated via an iterative procedure. Herein we consider primarily a), where L-statistics are used to make parametric inferences on their own. However, $S_n$ is often biased as an estimator of $S_0 = \int_0^1 Q(u)J(u)\,du$, where $Q(u) = \inf\{x : F(x) \ge u\}$, as is $g(S_n)$ biased for $g(S_0)$ for many choices of $g$. Also, there seems to be a dearth of procedures for consistent nonparametric estimation of the variance of $\sqrt{n}(S_n - S_0)$. (But see Sen (1979) in this regard.) For both of these problems, reduction of bias and consistent nonparametric variance estimation, the jackknife is a natural choice as a possibly non-optimal rough-and-ready tool.


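The ordinary jackknife of $S_n$, denoted $\tilde S_n$, may be written as

$$\tilde S_n = \frac{1}{n}\sum_{i=1}^{n} \tilde S_{i,n} = \sum_{j=1}^{n} J\!\left(\frac{j}{n+1}\right) X_{j,n} - \frac{1}{n}\sum_{j=1}^{n}\left[(j-1)\,J\!\left(\frac{j-1}{n}\right) + (n-j)\,J\!\left(\frac{j}{n}\right)\right] X_{j,n}, \tag{1.3}$$

where the $i$th pseudo-value is

$$\tilde S_{i,n} = n S_n - (n-1) S_{n-1}^{(i)} = \sum_{j=1}^{n} J\!\left(\frac{j}{n+1}\right) X_{j,n} - \left[\sum_{j=1}^{i-1} J\!\left(\frac{j}{n}\right) X_{j,n} + \sum_{j=i+1}^{n} J\!\left(\frac{j-1}{n}\right) X_{j,n}\right], \tag{1.4}$$

$S_{n-1}^{(i)}$ is the same L-statistic computed after deleting $X_{i,n}$ from the sample, and $\sum_{i=a}^{b} h(i)$ is understood to be defined to be zero if $a > b$. (See Miller (1974) for basic definitions of the jackknife and pseudo-values in general.) Note that the definition of $J$ at 0 and 1 is completely arbitrary since the associated terms will cancel out in (1.3).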
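The pseudo-values and the jackknife can be sketched directly from the delete-one definition. The code below is an illustrative sketch under the assumption that the deleted-sample statistic uses the weights $J(j/n)$ on the remaining $n-1$ order statistics; all function names are hypothetical. A second function evaluates the collapsed single-sum form of the jackknife, in which $J(0)$ and $J(1)$ only ever receive weight zero.

```python
def l_statistic(sample, J):
    """S_n = (1/n) * sum_i J(i/(n+1)) * X_{i,n}."""
    xs = sorted(sample)
    n = len(xs)
    return sum(J(i / (n + 1)) * x for i, x in enumerate(xs, start=1)) / n


def pseudo_values(sample, J):
    """Pseudo-values n*S_n - (n-1)*S_{n-1}^{(i)}, one per deleted order statistic."""
    xs = sorted(sample)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    s_n = l_statistic(xs, J)
    return [n * s_n - (n - 1) * l_statistic(xs[:i] + xs[i + 1:], J)
            for i in range(n)]


def jackknife(sample, J):
    """Jackknifed L-statistic: the average of the pseudo-values."""
    pv = pseudo_values(sample, J)
    return sum(pv) / len(pv)


def jackknife_closed_form(sample, J):
    """Single-sum form of the jackknife; J is never evaluated at 0 or 1."""
    xs = sorted(sample)
    n = len(xs)
    total = 0.0
    for j, x in enumerate(xs, start=1):
        w = J(j / (n + 1))
        if j > 1:  # the (j-1)*J((j-1)/n) term has weight zero when j = 1
            w -= (j - 1) * J((j - 1) / n) / n
        if j < n:  # the (n-j)*J(j/n) term has weight zero when j = n
            w -= (n - j) * J(j / n) / n
        total += w * x
    return total


data = [2.0, 5.0, 3.0, 10.0, 7.5]
score = lambda u: 6.0 * u * (1.0 - u)
print(jackknife(data, score), jackknife_closed_form(data, score))
```

The two printed values should agree up to rounding, which is a quick check that the delete-one loop and the collapsed sum describe the same estimator.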

We further define the sample variance of the pseudo-values as

$$s_p^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(\tilde S_{i,n} - \tilde S_n\right)^2. \tag{1.5}$$
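A minimal sketch of the variance estimate (1.5) follows, again with hypothetical function names and the delete-one weights $J(j/n)$ assumed for the reduced sample. For $J(u) = 1$ the pseudo-values are the observations themselves, so the estimate reduces to the usual sample variance, which the last lines illustrate.

```python
import statistics


def l_statistic(sample, J):
    """S_n = (1/n) * sum_i J(i/(n+1)) * X_{i,n}."""
    xs = sorted(sample)
    n = len(xs)
    return sum(J(i / (n + 1)) * x for i, x in enumerate(xs, start=1)) / n


def pseudo_values(sample, J):
    """Pseudo-values n*S_n - (n-1)*S_{n-1}^{(i)}."""
    xs = sorted(sample)
    n = len(xs)
    s_n = l_statistic(xs, J)
    return [n * s_n - (n - 1) * l_statistic(xs[:i] + xs[i + 1:], J)
            for i in range(n)]


def pseudo_value_variance(sample, J):
    """s_p^2: sample variance of the pseudo-values, as in (1.5)."""
    pv = pseudo_values(sample, J)
    n = len(pv)
    center = sum(pv) / n  # this is the jackknifed statistic
    return sum((v - center) ** 2 for v in pv) / (n - 1)


data = [1.0, 2.0, 3.0, 4.0]
# With J(u) = 1 the two numbers below should coincide up to rounding.
print(pseudo_value_variance(data, lambda u: 1.0), statistics.variance(data))
```

Dividing $s_p^2$ by $n$ gives the estimated variance of the L-statistic itself, which is the quantity used for interval construction.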


2.  RESULTS  ON  JACKKNIFING  L-STATISTICS 

Theorem 1: Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ from a distribution $F$ with $E_F|X|^p < \infty$ for some $\tfrac12 \le p \le 1$. Further, let $S_n$ and $\tilde S_n$ be as defined by (1.1) and (1.3). If

a) $p = 1$ and $J'$ satisfies a Hölder condition with $\alpha > \tfrac12$, or

b) $\tfrac12 \le p < 1$ and $J'$ satisfies a Hölder condition with [...]

[...] If $p < 1$ and $E_F|X|^p < \infty$, with $J'$ satisfying a Hölder condition with exponent $\alpha$, [...] $\to 0$ with probability one, by Marcinkiewicz' theorem (Loève (1977), p. 254). If $E_F|X| < \infty$ and $J'$ satisfies a Hölder condition with exponent $\alpha > \tfrac12$, [...] $\to 0$ with probability one, using the strong law of large numbers.

Several  comments  are  in  order: 

1) This result is the univariate analogue for L-statistics of result J.1 of Reeds (1978) for M-estimators. The extension to the multivariate case is immediate and hence omitted.

2) The moment condition on $F$ is seen by an inspection of the proof to be superfluous if $S_n$ "trims", i.e., if $J(u) = 0$ for $u \in (0,\varepsilon) \cup (1-\varepsilon, 1)$ for some $0 < \varepsilon < \tfrac12$.

3) If further $\sqrt{n}(S_n - S_0) \xrightarrow{d} N(0,\sigma^2)$, where

$$\sigma^2 = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} J(F(x))\,J(F(y))\,[F(\min(x,y)) - F(x)F(y)]\,dx\,dy > 0, \tag{2.1}$$

then $\sqrt{n}(\tilde S_n - S_0) \xrightarrow{d} N(0,\sigma^2)$ likewise. This is true if the conditions of Theorem 1 hold with $p = 1$ and $J'$ is of bounded variation (using the result of D. S. Moore (1968), with $\sigma^2 < \infty$).

4) Since $J''$ bounded on $(0,1)$ implies that $J'$ satisfies a Hölder condition with $\alpha = 1$, the condition on $J$ and $F$ in the theorem may be replaced by the stronger but intuitively clearer condition that $J''$ is bounded and $E_F|X|^{2/3} < \infty$.

5)  A  similar  theorem  was  stated  under  much  stronger 
conditions  by  Thorburn  (1976) . 

6) Theorem 1 yields a law of the iterated logarithm for the jackknife of a linear function of order statistics from the corresponding law for the original statistics. Using Theorem 4 (Example 1) of Wellner (1977b), we obtain that if $E_F|X|^{2+\varepsilon} < \infty$ for some $\varepsilon > 0$ (and $\alpha > \tfrac12$)

$$\limsup_{n\to\infty} \frac{\sqrt{n}\,\left|\tilde S_n - \int_0^1 Q(u)J_n(u)\,du\right|}{\sqrt{2\sigma^2 \log\log n}} = 1$$

with probability one, where $J_n(u) = J\!\left(\frac{i}{n+1}\right)$ for $\frac{i-1}{n} < u \le \frac{i}{n}$, $1 \le i \le n$, and $J_n(0)$ [...]

7) Similarly, a Berry-Esseen rate for $\tilde S_n$ follows directly from that of Helmers (1977). If $E_F|X| < \infty$, $\int_0^1 |J'(u)|\,dQ(u) < \infty$ and $\sigma^2 > 0$, we quickly obtain that

$$\sup_x |F_n^*(x) - \Phi(x)| = O(n^{-1/2}),$$

where $F_n^*(x)$ is the cumulative distribution of $(\tilde S_n - E[S_n])/(\operatorname{Var}(S_n))^{1/2}$ and $\Phi(\cdot)$ is the standard normal cumulative.

The following theorem gives conditions under which the jackknife provides a consistent estimator of the asymptotic variance of $\sqrt{n}(S_n - S_0)$ and $\sqrt{n}(\tilde S_n - S_0)$, and hence makes possible the construction of asymptotically pivotal quantities.

Theorem 2: Let $X_1, \ldots, X_n$ be a random sample of size $n$ from a distribution $F$, and let $s_p^2$ be defined by (1.5). If there exist positive numbers $\varepsilon$ and $\delta$ such that $0 < \delta < \varepsilon < \tfrac12$ (let $C(\varepsilon) = [0,\varepsilon) \cup (1-\varepsilon, 1]$); $J(u) = 0$ for $u \in C(\varepsilon)$, $fQ(u) \ge B > 0$ for $u \notin C(\varepsilon - \delta)$, and $J'$ satisfies a Hölder condition with exponent $\alpha > 0$, then $s_p^2 \xrightarrow{p} \sigma^2$, with $\sigma^2$ given by (2.1).

P 

Proof: Note that $\sigma^2 = \operatorname{Var}(H(U))$ where $U \sim U(0,1)$ and

$$H(u) = Q(u)J(u) - \int_0^1 Q(t)J'(t)\,[t - I(u \le t)]\,dt,$$

translating the results of Boos (1979, eq. 3.3) into the quantile domain and then integrating by parts. Now, from (1.5), it will suffice for the desired result if we show that

$$\sup_{1 \le i \le n}\left|\tilde S_{i,n} - H\!\left(\frac{i}{n+1}\right)\right| \xrightarrow{p} 0, \quad \text{as } n \to \infty. \tag{2.2}$$

If (2.2) holds, then

$$s_p^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(H\!\left(\frac{i}{n+1}\right) - \frac{1}{n}\sum_{j=1}^{n} H\!\left(\frac{j}{n+1}\right)\right)^2 + o_p(1).$$

Furthermore, $\frac{1}{n-1}\sum_{i=1}^{n} H^2\!\left(\frac{i}{n+1}\right)$ and $\frac{1}{n}\sum_{i=1}^{n} H\!\left(\frac{i}{n+1}\right)$ will converge to $\int_0^1 H^2(u)\,du$ and $\int_0^1 H(u)\,du$ respectively, giving $s_p^2 \xrightarrow{p} \sigma^2$. By definition,


$$\tilde S_{i,n} = \sum_{j=1}^{n} J\!\left(\frac{j}{n+1}\right) X_{j,n} - \sum_{j=1}^{i-1} J\!\left(\frac{j}{n}\right) X_{j,n} - \sum_{j=i+1}^{n} J\!\left(\frac{j-1}{n}\right) X_{j,n}$$

$$= J\!\left(\frac{i}{n+1}\right) X_{i,n} - \frac{1}{n}\sum_{j=1}^{n} J'\!\left(\frac{j}{n+1}\right)\left[\frac{j}{n+1} - I(i \le j)\right] X_{j,n} + R_{n2i},$$

using Taylor expansions and the Hölder condition on $J'$. Also using the trimming condition on $J$, the remainder term is such that $\sup_{1 \le i \le n} |R_{n2i}| = o_p(1)$ as $n \to \infty$.

From the continuity and boundedness of $J$ and the fact that $fQ(u) \ge B > 0$ wherever $J(u) \ne 0$,

$$J\!\left(\frac{i}{n+1}\right) X_{i,n} = J\!\left(\frac{i}{n+1}\right) Q\!\left(\frac{i}{n+1}\right) + R_{n3i},$$

with $\sup_{1 \le i \le n} |R_{n3i}| = o_p(1)$. Further,

$$\frac{1}{n}\sum_{j=1}^{n} J'\!\left(\frac{j}{n+1}\right)\frac{j}{n+1}\, X_{j,n} \to \int_0^1 Q(t)J'(t)\,t\,dt$$

with probability one from Corollary 2 to Theorem 1 of Wellner (1977).

Then, for fixed $K$ let $u_i = \frac{i}{K+1}$, $i = 0, 1, \ldots, K+1$. It follows easily that

$$L_{ni} = \frac{1}{n}\sum_{j=[nu_i]}^{n} J'\!\left(\frac{j}{n+1}\right) X_{j,n} \xrightarrow{p} \int_{u_i}^{1} J'(t)Q(t)\,dt = \int_0^1 J'(t)\,I(u_i < t)\,Q(t)\,dt.$$

Since $K$ is finite, the convergence in probability is uniform in $i$, $i = 0, 1, \ldots, K+1$. Furthermore, letting $\inf_{i=1,\ldots,K} |u - u_i|$ be achieved at index value $i^*$,

$$\sup_{0<u<1}\left|L_{ni^*} - \frac{1}{n}\sum_{j=[nu]}^{n} J'\!\left(\frac{j}{n+1}\right) X_{j,n}\right| \le \sup_{0<u<1}|J'(u)|\;\frac{|F_n^{-1}(\varepsilon)| + |F_n^{-1}(1-\varepsilon)|}{K+1},$$

which may be made less than any specified positive number in probability through a sufficiently large choice of $K$. Hence, (2.2) holds and $s_p^2 \xrightarrow{p} \sigma^2$, by the uniform continuity of $\int_u^1 J'(t)Q(t)\,dt$.

Some  pertinent  comments  follow: 

1) This result provides a method for consistent variance estimation for L-statistics, being the univariate analogue for L-statistics of result J.2 of Reeds (1978) for M-estimators.

2) Finiteness of $\sigma^2$ is clearly implied by the trimming and boundedness conditions on $J$.

3) This result makes possible the construction of nonparametric approximate confidence intervals for $S_0 = \int_0^1 Q(u)J(u)\,du$, using as pivotal quantities $\sqrt{n}(S_n - S_0)/s_p$ and $\sqrt{n}(\tilde S_n - S_0)/s_p$. This is, to the knowledge of the authors, the only nonparametric method of consistent variance estimation for L-statistics (i.e., in the absence of a specified parametric form for the unknown density) other than that of Sen (1979) discussed below. For a specific parametric family, a consistent estimator would of course typically be provided by

$$\hat\sigma^2 = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} J(F_{\hat\theta}(x))\,J(F_{\hat\theta}(y))\,[F_{\hat\theta}(\min(x,y)) - F_{\hat\theta}(x)F_{\hat\theta}(y)]\,dx\,dy, \tag{2.3}$$

where $F_\theta$, $\theta \in \Omega$ is the parametric family of densities and $\hat\theta$ is a weakly consistent estimator of $\theta$, possibly multivariate.

4) It would be interesting to compare the properties of this estimator with that proposed by Sen (1979), which is essentially

$$\hat\sigma^2 = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} J(F_n(x))\,J(F_n(y))\,[F_n(\min(x,y)) - F_n(x)F_n(y)]\,dx\,dy, \tag{2.4}$$

obtained by substituting the empirical distribution function $F_n$ for $F$ in (2.1).

5) Relaxation of the trimming condition on $J$, which would require either moment conditions on $F$ or joint conditions governing $J(u)$ and $Q(u)$ as $u$ approaches 0 and 1, is questionable from the standpoint of robust inference, and hence Theorem 2 is stated as being of interest in its own right. Theorem 3, however, obtains strong convergence of $s_p^2$, while dropping the trimming condition, at the cost of moment assumptions.
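The plug-in idea in comment 4 can be made concrete. Because the empirical distribution function is a step function equal to $i/n$ between the $i$th and $(i+1)$th order statistics, substituting it for $F$ in (2.1) turns the double integral into a finite double sum over the gaps between adjacent order statistics. The sketch below is one way to evaluate that sum; the function name is hypothetical and this is an illustration of the substitution, not necessarily the form used by Sen (1979).

```python
def plug_in_variance(sample, J):
    """Evaluate (2.1) with F replaced by the empirical distribution function.

    On the interval between the i-th and (i+1)-th order statistics the
    empirical distribution function equals i/n, so the integrand is constant
    on each rectangle of the grid formed by the order statistics.
    """
    xs = sorted(sample)
    n = len(xs)
    gaps = [xs[k + 1] - xs[k] for k in range(n - 1)]  # widths of the intervals
    total = 0.0
    for i in range(1, n):
        for j in range(1, n):
            kernel = min(i, j) / n - (i / n) * (j / n)  # F_n(min) - F_n * F_n
            total += J(i / n) * J(j / n) * kernel * gaps[i - 1] * gaps[j - 1]
    return total


data = [1.0, 2.0, 3.0, 4.0]
# With J(u) = 1 the result is the variance of the empirical distribution,
# i.e. the sum of squared deviations divided by n (not n - 1).
print(plug_in_variance(data, lambda u: 1.0))
```

The double loop costs $O(n^2)$ operations, comparable to the delete-one recomputation behind $s_p^2$ when each deleted-sample statistic is recomputed from scratch.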

Theorem 3: Let $X_1, \ldots, X_n$ be a random sample of size $n$ from a distribution $F$ with $E|X|^{2+\varepsilon} < \infty$ for some $\varepsilon > 0$. If $J'$ is continuous on $[0,1]$, then $s_p^2 \to \sigma^2$ with probability one.

Proof:  We  proceed  as  in  the  proof  of  Theorem  2  to  write 


-  J 

i,n 


i  ] 

X  _  1  r  jt  f  .1 

|n  +  lj 

[rti  -  I(J  i 


-  «  -  J'(drT]>(drT  -  «J.  <  . 


where $\frac{j-1}{n} \le \eta_{nji} \le \frac{j}{n}$, and in particular $\eta_{nii} = \frac{i}{n+1}$.


The proof is then concluded by showing i) $\frac{1}{n}\sum_{i=1}^{n}\tilde S_{i,n} \to \int_0^1 Q(u)J(u)\,du$ and ii) $\frac{1}{n}\sum_{i=1}^{n}\tilde S_{i,n}^2 \to \sigma^2 + \left(\int_0^1 Q(u)J(u)\,du\right)^2$, both convergences being with probability one. Proceeding with the first part,
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ill"1*"  n  ill'  ln  +  J*1*0  "  n  j-lJ  ln  +  LJ  in+1  "  nJ  XJ*n 


-  n  -  n 

‘  n  1  n  E 

n  1-1  Q  j-1  Qji 


*  •  I(J-1)]XJ,n- 


The first term converges to $\int_0^1 Q(u)J(u)\,du$ with probability one by Theorem 4, Example 1 of Wellner (1977a). The second term is

$$\frac{1}{n}\left\{\frac{1}{n}\sum_{j=1}^{n} J'\!\left(\frac{j}{n+1}\right)\frac{j}{n+1}\,X_{j,n}\right\},$$

which converges to 0 with probability one by the same result (since $\frac{1}{n}\sum_{j=1}^{n} J'\!\left(\frac{j}{n+1}\right)\frac{j}{n+1}\,X_{j,n}$ converges to $\int_0^1 Q(u)\,u\,J'(u)\,du$ with probability one). Lastly,


n  n 

-  z  £  e  CJ'(n  .t)  -  Jf 

n  i-l  n  j-l  nji 


f-4rl)  f-4r  -  Kj  <  i)lx, 

(n+lj '  jn+1  -  'J  j,n 


<  .Up|j'(rnji)  -  j’^) 

If  J 


which  converges  to  zero  with  probability  one  using  the  uniform 
continuity  of  J'  and  the  first  moment  assumption  on  F.  For 
the  second  part. 


”  * "  j1IJ(sr)xi.»  -  n  **  (&]{;&  - I(J  i  “h.n 


-  r  i  (J’(nn1i) 


■ :  j1tJ(sr]xi,n  -  j 


-  2 


-  Z  [jf— X  --  z  J’ f-^r-|(-jr-  -  I(j  <  i)\x 

n  i-1  ln+1J  1»n  n  j-i  l.n+lj\n+l  VJ-  j  j 


“  2  U'(nn<1> 
n  im±  nji 


1  ?  I1  rnf  *  T-f  1  1  . 1- 

“  i-i1"  j-i.  ' W  ' J  p+rj'lSi' 


^ln  +  A2n  +  A3n  * 


The third term, $A_{3n}$, is less than or equal to (in absolute value)


s*u;iJ’<’W)  -  j’(s3t)i  x.,xi|)2  • 

3  1*1 


which converges with probability one to zero via the strong law of large numbers and the continuity condition on $J'$. The second term, $A_{2n}$, is disposed of in similar fashion. The first term, $A_{1n}$, requires a more extended analysis.


i  n 

A.  -  -  Z  J2 
“In  n  i.1 


vn+1, 


C  2  + 1 j  ri  ;  j.ll 

i,n  n  lBl[n  j.1‘J  (n+l 


n+l] jn+T  “  I(J 


-  “  '  »  J/' 


n  t  ,  ^  t 


n+l 


,n+1, 


-  I(j 


-‘’h. 


■  B,  +  B.  +  B_  . 
In  2n  3n 


$B_{1n} \to \int_0^1 Q^2(u)J^2(u)\,du$ with probability one by Wellner's Theorem 4. By a similar argument and rearrangement of terms,


B2n  «*■  (  /  Q(u)  J'  (u)udu)  -2  /  uJ' (u)Q(u)du  •  /  J' (u)  (1  -  u)Q(u)du 


0 

.2, 


1  u 

+  J  (  /  Q(v) J* (v)dv)  du 
0  0 


and 


1  1 

B-  ■*>  -2  /  Q(u)J(u)du  •  /  Q(u)J'(u)udu 

Jn  0  0 

-  2  /  Q(u)J(u) [  /  Q(v)J'(v)dv|du  , 

0  Vn  J 
both convergences holding with probability one. The result then follows by integrating the expression for $\sigma^2$ given by (2.1) by parts and observing that the quantity to which $s_p^2$ converges with probability one is indeed $\sigma^2$. (Note that the appeals to Wellner's result are actually to a slight modification of it allowing random $J_n$ which satisfy the boundedness and convergence conditions with probability one.)

Example 1: The sample mean, for which $J(u) = 1$, clearly satisfies the conditions of Theorem 1 (but not the trimming conditions of Theorem 2) if $E_F|X|^{2/3} < \infty$. In fact, $\tilde S_n = S_n$ in this case, and $s_p^2 = \sum_{i=1}^{n}(X_i - \bar X)^2/(n-1) \to \sigma^2$ if $E_F|X|^2 < \infty$. Theorem 3 requires $E|X|^{2+\varepsilon} < \infty$ for some $\varepsilon > 0$. Thus, the usual strong law for the sample variance "just fails" to be a corollary of Theorem 3.


Example 2: While the ordinary trimmed means do not satisfy the score conditions of Theorems 1 and 2, a smoothed version causing $J$ to return to zero in such a way that it is differentiable with $J'$ obeying a Hölder condition with $\alpha > \tfrac12$ would satisfy those conditions. We assume that the modified score is also zero on $C(\varepsilon)$.


Example 3: Gini's mean difference ($J(u) = u - \tfrac12$) and the optimal score for location estimation for a logistic population ($J(u) = 6u(1-u)$) clearly satisfy the conditions of Theorem 1 if $E_F|X| < \infty$ and those of Theorem 3 if $E|X|^{2+\varepsilon} < \infty$, but violate the trimming conditions of Theorem 2.

In most instances, Theorems 2 or 3 will be of primary interest, providing methods for the construction of approximate confidence intervals based upon L-statistics (and one-to-one functions thereof). The bias of an L-statistic will often be small (but see Section 3 in this regard), and hence $\tilde S_n$ will be of limited practical use for the purposes of estimation with reduced bias. However, the end goal of an analysis may be to estimate or construct approximate tests or confidence intervals for $g(S_0)$. For estimation, $g(S_n)$ may suffer from severe bias if $g$ is highly non-linear near $S_0$ (recall that $E_F[g(S_n)] \approx g(S_0) + \frac{g''(S_0)}{2}\operatorname{Var}_F(S_n)$). Hence jackknifing $g(S_n)$ would be of interest in such cases for reduction of bias. If $g$ is non-monotone, confidence intervals for $g(S_0)$ obtained by finding a confidence interval for $S_0$ using $\sqrt{n}(S_n - S_0)/s_p$ as an approximately standard normal pivotal quantity and taking the image of such an interval under $g$ may well result in longer intervals than would be obtained by pivoting about $g(S_n)$. Thus, it seems to be of independent interest to study the behavior of $g(S_n)$ under jackknifing. The following theorem parallels Theorem 1, establishing that $g(S_n)$ and its jackknife have the same limiting distribution.

Theorem 4: Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ from a distribution $F$ with $E[|X|^3] < \infty$. Let $J'$ obey a Hölder condition with $\alpha > \tfrac12$. If $g$ is a function with a bounded second derivative in a neighborhood of $S_0$, and

$$\tilde g(S_n) = n\,g(S_n) - \frac{n-1}{n}\sum_{i=1}^{n} g\!\left(S_{n-1}^{(i)}\right)$$

is the jackknife of $g(S_n)$, then $\sqrt{n}\,(\tilde g(S_n) - g(S_n)) \to 0$ with probability one.

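Proof: Expanding each $g(S_{n-1}^{(i)})$ about $S_n$,

$$\sqrt{n}\,(\tilde g(S_n) - g(S_n)) = -\sqrt{n}\,\frac{n-1}{n}\sum_{i=1}^{n}\left[g'(S_n)\left(S_{n-1}^{(i)} - S_n\right) + \tfrac12\, g''(\xi_{in})\left(S_{n-1}^{(i)} - S_n\right)^2\right], \tag{2.5}$$

where $\xi_{in}$ lies between $S_n$ and $S_{n-1}^{(i)}$. From (1.1) and (1.4),

$$S_{n-1}^{(i)} - S_n = \frac{1}{n-1}\left[\sum_{j=1}^{i-1} J\!\left(\frac{j}{n}\right) X_{j,n} + \sum_{j=i+1}^{n} J\!\left(\frac{j-1}{n}\right) X_{j,n}\right] - \frac{1}{n}\sum_{j=1}^{n} J\!\left(\frac{j}{n+1}\right) X_{j,n}.$$

So

$$\sup_{1 \le i \le n}\left|S_{n-1}^{(i)} - S_n\right| \le \frac{M}{n}\cdot\frac{1}{n}\sum_{j=1}^{n}|X_{j,n}| + \frac{1}{n}\sup_{0<u<1}|J(u)|\,\left(|X_{n,n}| + |X_{1,n}|\right),$$

with $M$ determined by the bounds on $J$ and $J'$. Finally, this expression converges to zero with probability one by the strong law of large numbers and the fact that $\max(|X_{1,n}|, |X_{n,n}|)/n \to 0$ with probability one if $E_F|X| < \infty$.

The first term in (2.5) converges to zero with probability one by Theorem 1. The second term, denoted by $R_n$, can be bounded as follows,

$$|R_n| = \left|\sqrt{n}\,\frac{n-1}{2n}\sum_{i=1}^{n} g''(\xi_{in})\left(S_{n-1}^{(i)} - S_n\right)^2\right| \le L\,\frac{\sqrt{n}\,(n-1)}{2n}\sum_{i=1}^{n}\left(S_{n-1}^{(i)} - S_n\right)^2,$$

where $|g''(a)| \le L$ for $a$ in some neighborhood of $S_0$, for $n$ sufficiently large.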
derivative in a neighborhood of $S_0$. If


.(i) 


S<*>  *  <n-1)if1'8(Sn-i>  -  *«„»  ■  «•« 


then $s_{p(g)}^2 \to (g'(S_0))^2\sigma^2$, with $\sigma^2$ given by (2.1).


Proof: The proof proceeds by second-order Taylor expansion of the $g(S_{n-1}^{(i)})$ about $S_n$, and then follows the method of Theorem 2.

It should be noted that Theorem 2 (1) is (is not) a special case of Theorem 5 (4) with $g(\cdot)$ the identity, due to the identical (additional) conditions imposed on $F$ and $J$ in the latter theorem.

If the conditions for both Theorems 2 and 4 are satisfied, asymptotically pivotal quantities for the construction of confidence intervals for $g(S_0)$ include

$$Z_{1n} = \frac{\sqrt{n}\,(g(S_n) - g(S_0))}{s_{p(g)}}$$

and

$$Z_{2n} = \frac{\sqrt{n}\,(g(S_n) - g(S_0))}{g'(S_n)\,s_p}.$$

If conditions for Theorems 1 and 3 hold, $g(S_n)$ could be replaced by $\tilde g(S_n)$ and $g'(S_n)$ by $g'(\tilde S_n)$ (or even $\tilde g'(S_n)$ if $g$ has a bounded third derivative in a neighborhood of $S_0$). The question naturally arises as to which choices would be best in moderate-size sample applications of the above results.

The issue of what quantity to jackknife, that is, whether to use

$$\frac{\sqrt{n}\,(S_n - S_0)}{s_p} \quad \text{or} \quad \frac{\sqrt{n}\,(g(S_n) - g(S_0))}{s_{p(g)}},$$

has not been addressed in a systematic fashion in the literature. However, the common suggestions of Miller (1974, p.12) and Efron (1979, p.1920) that the function to be jackknifed should be variance stabilized are reasonable; for example, $\tanh^{-1} r$ does jackknife more satisfactorily than $r$, the ordinary sample correlation coefficient. Using this advice, when estimating a function $g$ of a location parameter $S_0$, the location parameter estimate $S_n$ should itself be jackknifed to produce a variance estimator. In other words, $Z_{2n}$ is recommended.

3.  TRIAL  BY  NUMBERS 

While the above results give a large-sample justification for use of the jackknife method for the creation of approximate confidence intervals, they leave unaddressed questions regarding appropriateness of the technique in small-sample situations. Accordingly, a modest Monte Carlo study is in order both i) to relate the large-sample theory to samples of a size likely to be encountered in practice, and ii) to explore the behavior of the jackknife for L-statistics whose score functions violate one or more of the regularity conditions of Theorems 1-4. All computations were performed on the AMDAHL 470 V/6 at Texas A&M University.

Only location parameter estimation is considered, that is $J(u) \ge 0$ for $0 < u < 1$ and $\int_0^1 J(u)\,du = 1$. The distributions considered are i) $N(5,1)$, a normal with mean 5 and variance one, ii) a logistic with mean 5 and scale parameter 1, with

$$F(x) = \left[1 + e^{-(x-5)}\right]^{-1}, \quad -\infty < x < \infty,$$

and iii) $U(4,6)$, a uniform distribution on the interval (4,6). Five hundred random samples of sizes 5, 10, 20 and 40 from each of the above three distributions were examined. The score functions considered are

$$J_1(u) = \begin{cases} 0, & 0 \le u < .05 \\ 23.53\,(u - .05), & .05 \le u < .10 \\ 1.1765, & .10 \le u < .90 \\ 23.53\,(.95 - u), & .90 \le u < .95 \\ 0, & .95 \le u \le 1, \end{cases}$$

a "smoothly" trimmed mean designed to obey the trimming conditions of the theorems but to fail to be differentiable at some points in $[0,1]$;

$$J_2(u) = \begin{cases} 4u, & 0 \le u < .5 \\ 4(1 - u), & .5 \le u \le 1.0, \end{cases}$$

a "triangular" weight function neither trimming nor being everywhere differentiable; and

$$J_3(u) = 6u(1 - u), \quad 0 \le u \le 1,$$

a score function meeting all differentiability requirements but failing to trim.

Unfortunately, even with the above three symmetric score functions which integrate to one and symmetric parents, biases can result due to the fact that $\frac{1}{n}\sum_{i=1}^{n} J_K\!\left(\frac{i}{n+1}\right) \ne 1$. Table 1 gives values of $\frac{1}{n}\sum_{i=1}^{n} J_K\!\left(\frac{i}{n+1}\right)$ for $n = 19, 20, 39,$ and $40$ and $K = 1, 2, 3$. Note that a value of 1 corresponds to no bias.
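The bias factors described above are simple to compute. The sketch below encodes the three score functions as read from the definitions in this section (the function names `j1`, `j2`, `j3` and `bias_factor` are introduced here for illustration) and evaluates the average weight for the four sample sizes.

```python
def j1(u):
    """'Smoothly' trimmed mean: zero in the tails, linear ramps, flat middle."""
    if 0.05 <= u < 0.10:
        return 23.53 * (u - 0.05)
    if 0.10 <= u < 0.90:
        return 1.1765
    if 0.90 <= u < 0.95:
        return 23.53 * (0.95 - u)
    return 0.0


def j2(u):
    """'Triangular' weight function."""
    return 4.0 * u if u < 0.5 else 4.0 * (1.0 - u)


def j3(u):
    """Smooth score 6u(1 - u), which does not trim."""
    return 6.0 * u * (1.0 - u)


def bias_factor(J, n):
    """Average weight (1/n) * sum_i J(i/(n+1)); a value of 1 means no bias."""
    return sum(J(i / (n + 1)) for i in range(1, n + 1)) / n


for n in (19, 20, 39, 40):
    print(n, [round(bias_factor(J, n), 3) for J in (j1, j2, j3)])
```

For $J_3$ the sum can be done by hand and equals $(n+2)/(n+1)$, which gives a quick sanity check on the code.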

(TABLE  1  ABOUT  HERE) 

In fact, often such biases will die out at the rate of $O(1/n^2)$ (obtained by viewing $\frac{1}{n}\sum_{i=1}^{n} J\!\left(\frac{i}{n+1}\right)$ as an application of the trapezoidal rule in approximating the integral $\int_0^1 J(u)\,du$; if $J''$ is bounded and continuous the error is $O(1/n^2)$). Hence, the ordinary jackknife would be of little or no use in dealing with these biases. Also, a practitioner using L-statistics would doubtlessly use the modified L-statistic

$$S_n^* = \frac{\sum_{i=1}^{n} J\!\left(\frac{i}{n+1}\right) X_{i,n}}{\sum_{i=1}^{n} J\!\left(\frac{i}{n+1}\right)}$$

to guarantee unbiasedness. All of Theorems 1-5 continue to hold for these modified $S_n^*$ if they hold for $S_n$, so long as $\lim_{n\to\infty} \sqrt{n}\left(1 - \frac{1}{n}\sum_{i=1}^{n} J\!\left(\frac{i}{n+1}\right)\right) = 0$. Based upon the 500 samples for each sample size and distribution combination the following are estimated:


Table 1. Bias Factors for L-Statistics

 n      J1      J2      J3
19    1.053   1.053   1.050
20    1.048   1.048   1.048
39    1.026   1.026   1.026
40    1.024   1.024   1.024

a) Variance of $S_n^*$

b) Variance of $\tilde S_n^*$, the jackknife of $S_n^*$

c) Mean of $s_p^2/n$, the jackknife variance estimator

d) Mean of $s_p/\sqrt{n}$, the jackknife standard deviation estimator

e-f) Percent coverages of approximate $100(1-\alpha)\%$ confidence intervals for $\int_0^1 Q(u)J(u)\,du$, obtained as $S_n^* \pm t_{1-\alpha/2,\,n-1}\, s_p/\sqrt{n}$, where $t_{1-\alpha/2,\,n-1}$ is the $100(1-\alpha/2)$ percent point of a t-distribution with $n-1$ degrees of freedom, for e) $\alpha = .10$, and f) $\alpha = .05$.

and g-h) Percent coverages for confidence intervals identical to those above, but centered on $\tilde S_n^*$, for g) $\alpha = .10$, and h) $\alpha = .05$.

Tables 2-5 present the results. It may be seen, even for samples of size 5, that the approximate confidence intervals maintain actual confidence coefficients close to the nominal 90 and 95 percent levels. Typically, the worst cases of undercoverage seem to occur for, curiously enough, the uniform parent. Intervals centered on $S_n^*$ seem slightly better in this regard than those centered on $\tilde S_n^*$, although the observed differences are of the same order as the standard errors of the empirical confidence coefficients. This may be related to the typically slightly larger standard error of $\tilde S_n^*$. Consistent with the result of Efron and Stein (1979), the jackknife variance estimator appears to be typically positively biased as an estimator of the variance of $S_n^*$.

For 28 of the 36 combinations of sample size, score, and parent population, the estimated mean of the jackknife variance estimator was greater than or equal to the estimated variance of $S_n^*$. Out of the 36 combinations, $\tilde S_n^*$ had a larger estimated variance than $S_n^*$ 27 times, with 3 ties (to 4 decimal places). However, the increase was typically small relative to the size of the estimates themselves. Interestingly, the estimated bias of $s_p/\sqrt{n}$ as an estimate of the standard error of $S_n^*$ is small, perhaps indicating that while the variance estimate may suffer from a positive bias, the standard error estimate is relatively better off. (Parenthetically, if $s_p/\sqrt{n}$ were exactly unbiased for $(\operatorname{Var}(S_n^*))^{1/2}$, then $s_p^2/n$ would have a bias of order $1/n^2$, assuming $\operatorname{Var}(S_n^*) = O(1/n)$.)

n 

4. APPLICATIONS

The large-sample results of Section 2, bolstered by the favorable Monte Carlo results in Section 3, provide a methodology for robust inference in linear models, in particular for completely randomized designs with multiple observations per treatment. We consider the model

$$X_{ij} = \mu + \alpha_i + \varepsilon_{ij}, \qquad i = 1, 2, \ldots, t, \quad j = 1, 2, \ldots, n_i,$$

where $\alpha_i$ represents the "effect" of the $i$th treatment. Note that we do not rule out a factorial structure for the $t$ treatments. If, instead of the usual assumptions that the $\varepsilon_{ij}$ are normally and independently distributed with mean 0 and common variance, we merely assume that the $\varepsilon_{ij}$ are independently and identically distributed, symmetrically about 0, we can pursue an L-statistic approach to analysis of variance (the symmetry assumption is merely convenient, not necessary).

If we assume $\min(n_1, n_2, \ldots, n_t) \to \infty$ and $\min(n_1, n_2, \ldots, n_t)/\sum_{i=1}^{t} n_i \ge a > 0$, then for a symmetric score function $J(\cdot)$ meeting the conditions of the appropriate theorems in Section 2, we denote

$$S_i^* = \frac{\sum_{j=1}^{n_i} J\!\left(\frac{j}{n_i+1}\right) X_{j,n_i}}{\sum_{j=1}^{n_i} J\!\left(\frac{j}{n_i+1}\right)}$$

and $s_i^2$ the corresponding variance estimate, where $X_{j,n_i}$ is the $j$th order statistic among those receiving treatment $i$. We then pool our variance estimates by

$$\hat\sigma^2 = \frac{\sum_{i=1}^{t} (n_i - 1)\,s_i^2}{\sum_{i=1}^{t} (n_i - 1)},$$

mimicking ordinary analysis of variance. Then, an approximate test for a contrast $\sum_{i=1}^{t} C_i\alpha_i$ ($\sum_{i=1}^{t} C_i = 0$) is provided by rejecting the null hypothesis ($\sum_{i=1}^{t} C_i\alpha_i = 0$) if and only if

$$\left(\sum_{i=1}^{t} C_i S_i^*\right)^2 > z_{1-\alpha/2}^2\;\hat\sigma^2 \sum_{i=1}^{t} \frac{C_i^2}{n_i},$$

with $z_{1-\alpha/2}$ the $1 - \alpha/2$ quantile for the standard normal. An obvious modification in the critical point would permit Scheffé-type procedures. Similarly, "robust" multiple comparisons could be done. Robustness of these procedures would, of course, depend upon the robustness and convergence rates of the associated L-statistic for location, a well-studied topic. Note that the "sums of squares" for this type of analysis can be simply computed by separately computing $S_i^*$ and $s_i^2$ for each treatment, and then inputting these into any standard analysis of variance package which will accept treatment means and variances as input.
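The contrast test can be sketched from per-treatment summaries alone. The function below is a hypothetical illustration: it takes the contrast coefficients, the per-treatment location estimates, the per-treatment pseudo-value variances, and the group sizes, pools the variances with weights $n_i - 1$, and compares the contrast with a standard normal critical value. The numbers in the usage lines are made up for illustration and are not results from the study.

```python
from math import sqrt
from statistics import NormalDist


def contrast_test(coeffs, estimates, variances, sizes, alpha=0.05):
    """Approximate two-sided test of the hypothesis sum_i C_i * alpha_i = 0.

    coeffs    -- contrast coefficients C_i (should sum to zero)
    estimates -- per-treatment L-statistic location estimates
    variances -- per-treatment pseudo-value variances
    sizes     -- per-treatment sample sizes n_i
    Returns (reject, contrast estimate, standard error).
    """
    if abs(sum(coeffs)) > 1e-9:
        raise ValueError("contrast coefficients must sum to zero")
    # Pooled variance, weighting each treatment by n_i - 1.
    pooled = (sum((n - 1) * s2 for n, s2 in zip(sizes, variances))
              / sum(n - 1 for n in sizes))
    contrast = sum(c * s for c, s in zip(coeffs, estimates))
    std_err = sqrt(pooled * sum(c * c / n for c, n in zip(coeffs, sizes)))
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # standard normal quantile
    return abs(contrast) > z * std_err, contrast, std_err


# Made-up summaries for two treatments with 20 observations each.
print(contrast_test([1.0, -1.0], [5.0, 5.1], [1.0, 1.0], [20, 20]))
print(contrast_test([1.0, -1.0], [5.0, 7.0], [1.0, 1.0], [20, 20]))
```

Comparing the absolute contrast with $z_{1-\alpha/2}$ times the standard error is equivalent to the squared form of the rejection rule.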


5.  SUMMARY 

The ordinary jackknife is a computationally simple means for the construction of large-sample confidence intervals for functionals of the form $S_0 = \int_0^1 Q(u)J(u)\,du$. Simulation results indicate that the technique is effective for small samples.